A Gradient-Based Metric Learning Algorithm for k-NN Classifiers
Authors
Abstract
In addition to being simple, Nearest Neighbor (NN) classification and regression techniques are among the most widely applied and well-studied techniques for pattern recognition in machine learning. A drawback, however, is that they assume a suitable metric is available for measuring distances to the k nearest neighbors. It has been shown that k-NN classifiers with a suitable distance metric can perform better than other, more sophisticated, alternatives such as Support Vector Machines and Gaussian Process classifiers. For this reason, much recent research on k-NN methods has focused on metric learning, i.e. finding an optimized metric. In this paper we propose a simple gradient-based algorithm for metric learning. We discuss in detail the motivations behind metric learning, namely error minimization and margin maximization. Our formulation differs from the prevalent metric learning techniques, whose goal is to maximize the classifier's margin: our proposed technique (MEGM) instead finds an optimal metric by directly minimizing the mean squared error. Our technique not only greatly improves k-NN performance, but also outperforms competing metric learning techniques. Promising results are reported on major UCIML databases.
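The abstract describes MEGM only at a high level, so the following Python sketch is illustrative rather than a reproduction of the paper's algorithm: it assumes a Mahalanobis-style metric parameterized as A = L^T L, Gaussian neighbor weights, and a leave-one-out soft prediction whose mean squared error is minimized by plain gradient descent. The function names (learn_metric_mse, _soft_predictions) and the kernel choice are assumptions, not taken from the paper.

import numpy as np


def _pairwise_sq_dists(X, L):
    """Squared distances ||L x_i - L x_j||^2 under the linear map L."""
    Z = X @ L.T
    sq = np.sum(Z ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)


def _soft_predictions(X, y, L):
    """Leave-one-out soft nearest-neighbor prediction with Gaussian weights.
    Features are assumed standardized so that exp(-distance) does not underflow."""
    D = _pairwise_sq_dists(X, L)
    W = np.exp(-D)
    np.fill_diagonal(W, 0.0)              # exclude each point from its own prediction
    W_sum = W.sum(axis=1) + 1e-12
    y_hat = (W @ y) / W_sum
    return y_hat, W, W_sum


def mse_loss_and_grad(X, y, L):
    """Mean squared error of the soft prediction and its gradient w.r.t. L."""
    n = X.shape[0]
    y_hat, W, W_sum = _soft_predictions(X, y, L)
    residual = y_hat - y
    loss = np.mean(residual ** 2)

    grad = np.zeros_like(L)
    for i in range(n):
        # d y_hat_i / d D_ik = -W_ik (y_k - y_hat_i) / W_sum_i
        coeff = -W[i] * (y - y_hat[i]) / W_sum[i]
        # chain rule: d loss / d D_ik, then d D_ik / d L = 2 L (x_i - x_k)(x_i - x_k)^T
        weighted = (2.0 / n) * residual[i] * coeff
        diffs = X[i] - X
        grad += 2.0 * L @ ((diffs.T * weighted) @ diffs)
    return loss, grad


def learn_metric_mse(X, y, n_iters=200, lr=0.05):
    """Plain gradient descent on the linear map L; the learned metric is A = L^T L."""
    L = np.eye(X.shape[1])
    for _ in range(n_iters):
        _, grad = mse_loss_and_grad(X, y, L)
        L -= lr * grad
    return L

Once L has been learned, the k-NN classifier would measure distances as ||L x - L x'||^2. In this sketch the targets y are assumed to be numeric (e.g. binary class labels encoded as 0/1), and the learning rate and iteration count would need tuning per dataset.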
Similar Resources
A Simple Gradient-based Metric Learning Algorithm for Object Recognition
In addition to being simple, Nearest Neighbor (NN) classification and regression techniques are among the most widely applied and well-studied techniques for pattern recognition in machine learning. Their only drawback is the assumption that a proper metric is available for measuring distances to the k nearest neighbors. It has been shown that k-NN classifiers with the right distance metric can...
Data Dependent Distance Metric for Efficient Gaussian Processes Classification
The contributions of this work are threefold. First, various metric learning techniques are analyzed and systematically studied under a unified framework to highlight the criticality of a data-dependent distance metric in machine learning. The metric learning algorithms are categorized as naive, semi-naive, complete and high-level metric learning, under a common distance measurement framework. Se...
Non-Euclidean or Non-metric Measures Can Be Informative
Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the me...
Comparison of Distance Metrics for Phoneme Classification based on Deep Neural Network Features and Weighted k-NN Classifier
k-nearest neighbor (k-NN) classification is a powerful yet simple classification method. k-NN classifiers approximate a Bayes classifier for large numbers of data samples. The accuracy of a k-NN classifier depends on the distance metric used to find the nearest neighbors and on the features used to represent instances in the training and testing data. In this paper we use deep neural networks (DNNs) as a f...
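As a concrete illustration of the dependence on the distance metric noted in this excerpt, the sketch below shows a minimal distance-weighted k-NN classifier with a pluggable metric. The Mahalanobis option and the inverse-distance weighting are assumptions made for this sketch; the cited paper's DNN-derived features and exact weighting scheme are not reproduced here.

import numpy as np
from collections import Counter


def weighted_knn_predict(X_train, y_train, x, k=5, M=None):
    """Classify x by an inverse-distance-weighted vote of its k nearest
    neighbors; M, if given, defines a Mahalanobis metric, else Euclidean."""
    diffs = X_train - x
    if M is None:
        d2 = np.sum(diffs ** 2, axis=1)                    # squared Euclidean
    else:
        d2 = np.einsum('ij,jk,ik->i', diffs, M, diffs)     # (x_i - x)^T M (x_i - x)
    nn = np.argsort(d2)[:k]
    votes = Counter()
    for idx in nn:
        votes[y_train[idx]] += 1.0 / (np.sqrt(d2[idx]) + 1e-12)
    return votes.most_common(1)[0][0]

Combined with a learned map such as the L sketched earlier, the call would be weighted_knn_predict(X_train, y_train, x_test, k=5, M=L.T @ L).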
Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network
Because of the interactions among the variables of a multiple input-multiple output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. The cement rotary kiln (CRK) is a MIMO nonlinear system in a cement factory with a complicated mechanism and uncertain disturbances. The identification of the CRK is very important for different pur...
Publication date: 2010